In this lab, you will work on the following concepts.
# portions of this lab were taken from Deep Learning with Python
# !pip3 install tqdm
import glob
import os
import random
import shutil
import tensorflow as tf
from tensorflow.keras.layers import *
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import *
from tensorflow.keras import regularizers
import matplotlib.pyplot as plt
%matplotlib inline
import pandas as pd
import numpy as np
from random import shuffle
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import MinMaxScaler
print(tf.__version__)
print(tf.config.list_physical_devices('GPU'))
2.9.1 [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
Execute the neural style transfer algorithm on a Santa Clara University image to generate a GIF of the successive generations of the picture.
https://keras.io/examples/generative/neural_style_transfer/
Examples of style images are provided in the styles directory. These images usually produce good results in style transfer.
Neural style transfer uses a convolutional neural network to extract content and style information from two input images, and then optimizes an output image to match the content statistics of the content image and the style statistics of the style reference image. The result appears to combine the two originals: the content of the content image rendered in the style of the style reference image.
We use the intermediate layers of the convolutional neural network to get the content and style representations of the image. Starting from the network's input layer, the first few layer activations represent low-level features like edges and textures. As we step deeper into the network, the final few layers represent higher-level features such as object parts.
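To make the layer hierarchy concrete, here is a minimal sketch of extracting intermediate activations with a multi-output Keras `Model`. The tiny two-block CNN and its layer names are hypothetical stand-ins for VGG19, used only to show the mechanics without downloading pretrained weights:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

# Hypothetical tiny CNN standing in for VGG19, just to show the mechanics.
inputs = layers.Input(shape=(32, 32, 3))
x = layers.Conv2D(8, 3, activation='relu', name='block1_conv')(inputs)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(16, 3, activation='relu', name='block2_conv')(x)
base = Model(inputs, x)

# Feature extractor returning the activations of the two named conv layers.
feat_model = Model(inputs=base.input,
                   outputs=[base.get_layer(n).output
                            for n in ('block1_conv', 'block2_conv')])

img = np.random.rand(1, 32, 32, 3).astype('float32')
low, high = feat_model(img)
print(low.shape, high.shape)  # deeper layer -> smaller spatial size, more channels
```

With the real VGG19 you would pass its actual layer names (for example, `block5_conv2` as the content layer in the Keras example) to `get_layer` in exactly the same way.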
The Keras example uses the VGG19 architecture, a pretrained image-classification network. Its intermediate layers define the representations of content and style: for an input image, we try to match the corresponding style and content target representations at those layers. The output image is gradually refined over many iterations until the loss is minimized and the desired style-transfer effect is achieved.
The optimization process minimizes a loss function that combines a content loss, a style loss, and a total variation loss, all computed from feature maps extracted from the network. The style loss is a sum of L2 distances between the Gram matrices of the representations of the base image and the style reference image, extracted from several layers of VGG19; it captures color and texture information at different spatial scales. The content loss is the L2 distance between the features of the base image (extracted from a deep layer) and the features of the combination image, which keeps the generated image close to the original.
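As a sketch in NumPy (which the lab already imports), the Gram matrix and the two main losses can be written as follows; the normalization constants are simplified relative to the Keras example:

```python
import numpy as np

def gram_matrix(features):
    # features: (height, width, channels) feature map from one layer.
    h, w, c = features.shape
    f = features.reshape(h * w, c)   # one feature vector per spatial location
    return f.T @ f / (h * w)         # channel-by-channel correlations

def style_loss(base_feats, style_feats):
    # L2 distance between the Gram matrices of the two feature maps.
    return np.sum((gram_matrix(base_feats) - gram_matrix(style_feats)) ** 2)

def content_loss(base_feats, combo_feats):
    # Plain L2 distance between the raw feature maps.
    return np.sum((base_feats - combo_feats) ** 2)

rng = np.random.default_rng(0)
a = rng.random((8, 8, 4))
b = rng.random((8, 8, 4))
print(style_loss(a, a))  # identical features -> 0.0
```

Because the Gram matrix averages over all spatial locations, the style loss ignores where a texture appears and only measures which channel correlations are present, which is what lets style transfer ignore the layout of the style image.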
The process of Neural Style Transfer:
Input Images: Select the content image and the style image. The content image is typically a photograph, while the style image can be a painting or other type of artwork.
Preprocessing: Both the content and style images are preprocessed before being used in the neural network. This typically involves resizing the images to a standard size and normalizing the pixel values.
Feature Extraction: The next step is to extract the content and style features from the preprocessed images using a convolutional neural network, typically a pre-trained model such as VGG-19 or ResNet-50, which has been trained on a large dataset of images and has learned to extract features such as edges, shapes, and textures.
Style and Content Loss: Once the features have been extracted, the style and content information of the two images can be compared by computing a style loss and a content loss. The style loss compares the correlations of features across different layers of the network between the style image and the generated image; these correlations are summarized in a Gram matrix, formed by taking the outer product of the feature vector with itself at each spatial location and averaging over all locations. The content loss compares the feature maps of the content image and the generated image directly.
Optimization: The final step is to optimize the generated image by minimizing the total loss, which is a combination of the style and content loss. This is done using an optimization algorithm such as gradient descent, which iteratively adjusts the pixel values of the generated image to minimize the loss.
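The optimization step can be illustrated with a toy NumPy example: here the "features" are just the raw pixels and the gradient is written by hand, standing in for the real Keras example, which differentiates the full loss through VGG19 with `tf.GradientTape`:

```python
import numpy as np

rng = np.random.default_rng(0)
content = rng.random((16, 16))      # target "content" image
generated = rng.random((16, 16))    # start from a random image

lr = 0.1
for step in range(200):
    grad = 2.0 * (generated - content)  # d/dx of sum((x - content)**2)
    generated -= lr * grad              # gradient-descent pixel update

print(np.abs(generated - content).max())  # near zero after optimization
```

In the real algorithm the update direction also includes the style and total variation terms, so the iterates converge to a compromise between the content and style targets rather than to the content image itself.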
# We trained on a different computer and saved out the intermediate images.
# The code below generates a GIF file from those intermediate images.
# Image 1
import imageio.v2 as imageio
import os
file_path = 'scu_style_transfer_1/'
filenames = sorted(os.listdir(file_path))
with imageio.get_writer('scu_style_transfer_1.gif', mode='I', duration=0.08) as writer:
    for filename in filenames:
        image = imageio.imread(file_path + filename)
        writer.append_data(image)
# Image 2
file_path = 'scu_style_transfer_2/'
filenames = sorted(os.listdir(file_path))
with imageio.get_writer('scu_style_transfer_2.gif', mode='I', duration=0.08) as writer:
    for filename in filenames:
        image = imageio.imread(file_path + filename)
        writer.append_data(image)
# Show style image and two GIF images
from IPython.display import Image, display
base_image_paths = ['scu_style_transfer_2.gif', 'scu_style_transfer_1.gif']
style_reference_image_path = '9ooB60I.jpg'
display(Image(style_reference_image_path))
# for base_image_path in base_image_paths:
# display(Image(base_image_path))